There are a lot of great tools for defining and applying color palettes for use with ggplot in R.
These examples use the following packages:
library(tidyverse)
library(sf)
library(MetBrewer)
library(RColorBrewer)
library(ggthemes)
I’ll start with data on counties in the contiguous United States. I’ve joined it with a spreadsheet of Urban-Rural classifications from the United States Department of Agriculture so that I can demonstrate with a categorical variable as well. Here are the first few rows:
| NAMELSAD10 | Description | pct_ua | geometry |
|---|---|---|---|
| Panola County | Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-90.13476 3… |
| Newton County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-89.13497 3… |
| Coahoma County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-90.59063 3… |
| Madison Parish | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-91.03511 3… |
| Charlottesville city | Metro - Counties in metro areas of fewer than 250,000 population | 1.00 | MULTIPOLYGON (((-78.47071 3… |
| Alexandria city | Metro - Counties in metro areas of 1 million population or more | 1.00 | MULTIPOLYGON (((-77.06247 3… |
| Buena Vista city | Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-79.36681 3… |
| Fairfax city | Metro - Counties in metro areas of 1 million population or more | 1.00 | MULTIPOLYGON (((-77.31427 3… |
| Dallas County | Metro - Counties in metro areas of 1 million population or more | 0.99 | MULTIPOLYGON (((-96.52999 3… |
| Howard County | Nonmetro - Urban population of 20,000 or more, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-101.1747 3… |
| Foard County | Nonmetro - Completely rural or less than 2,500 urban population, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-99.59622 3… |
| Floyd County | Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-101.5645 3… |
| Brewster County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-103.3254 3… |
| Franklin County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-95.30845 3… |
| Hunt County | Metro - Counties in metro areas of 1 million population or more | 0.00 | MULTIPOLYGON (((-95.86178 3… |
| Newton County | Metro - Counties in metro areas of 250,000 to 1 million population | 0.00 | MULTIPOLYGON (((-93.55074 3… |
| Starr County | Nonmetro - Urban population of 20,000 or more, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-98.85127 2… |
| Kinney County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-100.1122 2… |
| Collingsworth County | Nonmetro - Completely rural or less than 2,500 urban population, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-100.5396 3… |
| Tom Green County | Metro - Counties in metro areas of fewer than 250,000 population | 0.84 | MULTIPOLYGON (((-100.4416 3… |
| Coleman County | Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-99.71619 3… |
| Brazoria County | Metro - Counties in metro areas of 1 million population or more | 0.72 | MULTIPOLYGON (((-95.05714 2… |
| Leon County | Nonmetro - Completely rural or less than 2,500 urban population, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-96.26059 3… |
| Duval County | Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area | 0.00 | MULTIPOLYGON (((-98.79923 2… |
| Lamb County | Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area | 0.00 | MULTIPOLYGON (((-102.401 33… |
Here is a choropleth map showing the variation in the share of each county’s population that lives in a census tract classified as being in an urbanized area. By default, continuous variables are represented by shades of blue.
continuous_map <- ggplot(counties) +
  geom_sf(aes(fill = pct_ua)) +
  coord_sf(crs = 5070) +
  theme_map()
continuous_map
Here is a choropleth map showing the urban-rural classification categories. The default for categorical variables is to select colors with hues that are evenly spaced around the color wheel.
categorical_map <- ggplot(counties) +
  geom_sf(aes(fill = Description)) +
  coord_sf(crs = 5070) +
  theme_void() +
  theme(legend.position = "bottom",
        legend.direction = "vertical")
categorical_map
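To make the idea of evenly spaced default colors concrete, here is a minimal sketch in base R that approximates them. It assumes the commonly cited defaults (hues starting at 15 degrees, chroma of 100, luminance of 65), and the helper name default_hues() is my own invention rather than anything from ggplot2.

```r
# Hypothetical helper (not part of ggplot2): approximate the default
# discrete colors by spacing n hues evenly around the color wheel.
default_hues <- function(n) {
  # Generate n + 1 evenly spaced angles and drop the last one, because
  # 15 and 375 degrees are the same hue
  hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  hcl(h = hues, c = 100, l = 65)
}

# Nine colors, one for each urban-rural category
nine_defaults <- default_hues(9)
nine_defaults
```

Because neighboring hues in a set like this can be hard to tell apart, it is often worth choosing the colors yourself.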
Before you get too far into customizing colors on R plots, it is
useful to know how to refer to specific colors in your code. There
are hundreds of different colors you can refer to by name in R, as shown
below. In addition to these, you can refer to a shade of gray with
the word gray (or grey) followed by a
number ranging from 0 to 100 (e.g. gray30 or
gray80), where gray0 is black and gray100 is white. If you leave off the number (just
gray), you’ll get a shade that is almost identical to
gray75.
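If you want to check any of this from the console, the base R functions colors() and col2rgb() are enough. The short sketch below lists the named colors and prints the red, green, and blue components (on a 0 to 255 scale) of a few grays.

```r
# All of the color names R recognizes
all_names <- colors()
head(all_names)

# Numbered grays run from gray0 (black) to gray100 (white);
# the "grey" spelling refers to the same colors
col2rgb(c("gray0", "gray30", "grey30", "gray80", "gray100"))

# Plain "gray" and "gray75" sit one step apart on each channel
col2rgb(c("gray", "gray75"))
```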
You can also refer to colors by their RGB (red-green-blue) hex codes.
These are six-digit strings preceded by a #, with two hexadecimal digits each for red, green, and blue. There are
a number of web-based tools for finding a color’s hex code, like this one.
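You can also move between color names and hex codes without leaving R. This is a small sketch using only base R functions (col2rgb() and rgb()); the variable names are just for illustration.

```r
# Look up the red, green, and blue components (0-255) of a named color
orange_rgb <- col2rgb("orange")
orange_rgb

# Convert those components into a hex code
orange_hex <- rgb(t(orange_rgb), maxColorValue = 255)
orange_hex

# Go the other way: from a hex code to its components
col2rgb("#ff9900")
```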
For a categorical variable, I can define a color scale using the
scale_fill_manual() function (or
scale_color_manual() for points and lines). Again, I can
reference colors by their names:
categorical_map +
  scale_fill_manual(values = c("white",
                               "gray",
                               "black",
                               "red",
                               "orange",
                               "yellow",
                               "green",
                               "blue",
                               "purple"))
Or by hex codes:
categorical_map +
  scale_fill_manual(values = c("#ffffff",
                               "#999999",
                               "#000000",
                               "#ff0000",
                               "#ff9900",
                               "#ffff33",
                               "#00ff00",
                               "#0000ff",
                               "#9933ff"))
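With an unnamed vector, colors are matched to categories by position, so it can be hard to tell which color goes with which category. One option is to pass scale_fill_manual() a named vector instead. The sketch below uses a made-up toy data frame with three shortened category labels (not the real USDA descriptions) so that it runs on its own.

```r
library(ggplot2)

# Toy data (made up for illustration)
toy <- data.frame(category = c("Metro",
                               "Nonmetro, adjacent",
                               "Nonmetro, not adjacent"),
                  n = c(3, 2, 1))

# Names are category values; the order of the vector no longer matters
category_colors <- c("Nonmetro, not adjacent" = "#0000ff",
                     "Metro" = "#ff0000",
                     "Nonmetro, adjacent" = "#ff9900")

toy_plot <- ggplot(toy) +
  geom_col(aes(x = category, y = n, fill = category)) +
  scale_fill_manual(values = category_colors)
toy_plot
```

The same approach should carry over to the county map, with the full category descriptions as the names.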
For a continuous variable, I can set up a gradient.
scale_fill_gradient() will let you set up a gradient from
one color to another.
continuous_map +
  scale_fill_gradient(low = "blue", high = "red")
scale_fill_gradient2() will let you set up a gradient from
one color, through a midpoint, to another color. Helpfully, you can
specify the value of your midpoint if you don’t want it to be the middle
of the range. Here are three plots of the same variable with different
values for the midpoint on the color ramp.
This one places the midpoint in the middle of the range.
continuous_map +
  scale_fill_gradient2(low = "green", mid = "yellow", high = "red",
                       midpoint = 0.5)
This one has a midpoint in the higher part of the range.
continuous_map +
  scale_fill_gradient2(low = "green", mid = "yellow", high = "red",
                       midpoint = 0.75)
This one has a midpoint in the lower part of the range.
continuous_map +
  scale_fill_gradient2(low = "green", mid = "yellow", high = "red",
                       midpoint = 0.25)
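One thing to watch for with an off-center midpoint: as far as I understand the rescaling behind scale_fill_gradient2() (it relies on rescale_mid() from the scales package), values are stretched symmetrically around the midpoint, so the shorter side of the range never reaches its end color. The sketch below is my own base R approximation of that calculation, assuming the data run from 0 to 1 as pct_ua does. The helper ramp_position() is hypothetical and only meant to show the arithmetic.

```r
# Hypothetical helper: where does a value land on the color ramp?
# 0 is the "low" color, 0.5 is the "mid" color, and 1 is the "high" color.
ramp_position <- function(x, midpoint, data_range = c(0, 1)) {
  # Distance from the midpoint to the farther end of the data range
  half_width <- max(abs(data_range - midpoint))
  (x - midpoint) / (2 * half_width) + 0.5
}

# Midpoint in the middle: the data use the whole ramp
ramp_position(c(0, 0.5, 1), midpoint = 0.5)

# Midpoint at 0.75: the largest value stops short of the "high" color
ramp_position(c(0, 0.75, 1), midpoint = 0.75)

# Midpoint at 0.25: the smallest value starts above the "low" color
ramp_position(c(0, 0.25, 1), midpoint = 0.25)
```

If you need the full ramp on both sides of an off-center midpoint, scale_fill_gradientn() with explicit color positions may be a better fit.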
scale_fill_gradientn() will let you create a color ramp
that includes more than three colors.
continuous_map +
  scale_fill_gradientn(colors = c("white",
                                  "gray",
                                  "black",
                                  "red",
                                  "orange",
                                  "yellow",
                                  "green",
                                  "blue",
                                  "purple"))
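By default the colors in the ramp are spread evenly across the range of the data. If you want them placed unevenly, scale_fill_gradientn() also takes a values argument giving the position of each color on a 0 to 1 scale. Here is a minimal sketch on made-up toy data (a strip of tiles), not the county map.

```r
library(ggplot2)

# Toy data (made up): a strip of tiles with values from 0 to 1
toy <- data.frame(x = seq(0, 1, by = 0.1))

uneven_ramp <- ggplot(toy) +
  geom_tile(aes(x = x, y = 1, fill = x)) +
  scale_fill_gradientn(colors = c("white", "red", "black"),
                       # Position of each color on a 0-1 scale:
                       # red sits a tenth of the way up, not halfway
                       values = c(0, 0.1, 1))
uneven_ramp
```

This can help with a skewed variable like pct_ua, where most counties sit near zero.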
Manually defining the colors in a palette can be tedious, and you’ll often end up with ugly color palettes. Luckily, there are several predefined color palettes you can use.
The viridis color ramps are meant to be colorblind-friendly and to reproduce reasonably well in grayscale (for example, if someone prints your graphic on a black-and-white printer or photocopies it). Four of the options are shown below.
Option A:
continuous_map +
  scale_fill_viridis_c(option = "A")
Option B:
continuous_map +
  scale_fill_viridis_c(option = "B")
Option C:
continuous_map +
  scale_fill_viridis_c(option = "C")
Option D (this is the default if you don’t specify an option):
continuous_map +
  scale_fill_viridis_c(option = "D")
You can reverse the color ramp values by setting
begin = 1, end = 0.
continuous_map +
  scale_fill_viridis_c(option = "D",
                       begin = 1, end = 0)
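The viridis scales also have a direction argument, and setting direction = -1 should give the same reversed ramp. The sketch below compares the two approaches on a made-up strip of tiles rather than the county map; the object names are only for illustration.

```r
library(ggplot2)

# Toy data (made up): five tiles with values 1 through 5
toy <- data.frame(x = 1:5)

base_plot <- ggplot(toy) +
  geom_tile(aes(x = x, y = 1, fill = x))

# Two ways of asking for a reversed ramp
reversed_a <- base_plot +
  scale_fill_viridis_c(option = "D", begin = 1, end = 0)
reversed_b <- base_plot +
  scale_fill_viridis_c(option = "D", direction = -1)

# Compare the fill colors each version assigns to the tiles
identical(layer_data(reversed_a)$fill, layer_data(reversed_b)$fill)
```

Using direction leaves begin and end free for trimming the ramp, for example to avoid the very lightest colors.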
You can also use the viridis color palette for discrete variables:
categorical_map +
  scale_fill_viridis_d(option = "A")
The RColorBrewer package offers a bunch of pre-defined
palettes that can be useful for continuous and categorical data. You can
view the options with the function display.brewer.all():
display.brewer.all()
The top group of palettes shown above contains the sequential palettes, which range from light to dark colors.
To use a Color Brewer palette, you first set up the palette with the
brewer.pal() function, indicating the number of colors you
want and the palette to draw them from.
cat_palette <- brewer.pal(9, "Pastel1")
cont_palette <- brewer.pal(5, "Spectral")
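Each Color Brewer palette has a limit on how many colors you can request. The package's brewer.pal.info table lists that limit along with each palette's type. The sketch below looks up the two palettes used above and then shows one possible workaround when you need more colors than a palette provides: interpolating with base R's colorRampPalette(). The name more_blues is just for illustration.

```r
library(RColorBrewer)

# Maximum number of colors and palette type (seq, div, or qual)
brewer.pal.info[c("Pastel1", "Spectral"), ]

# One way to get more colors than a palette offers: interpolate
# between its colors with colorRampPalette()
more_blues <- colorRampPalette(brewer.pal(9, "Blues"))(12)
length(more_blues)
```

Interpolating makes the most sense for sequential and diverging palettes. For categorical palettes, the in-between colors can be hard to tell apart.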
Then you can apply that palette using
scale_fill_manual() for categorical variables.
categorical_map +
  scale_fill_manual(values = cat_palette)
Or scale_fill_gradientn() for continuous variables.
continuous_map +
  scale_fill_gradientn(colors = cont_palette)
As with the viridis functions, you can reverse the direction of a
color ramp. You can do this by calling the rev() function on
the color ramp you’ve defined.
continuous_map +
  scale_fill_gradientn(colors = rev(cont_palette))
There are some other fun packages that define color palettes, including wesanderson (colors inspired by Wes Anderson films), PNWColors (colors inspired by the package author’s photos of the Pacific Northwest), MetBrewer (inspired by works at the Metropolitan Museum of Art), and MexBrewer (inspired by works of Mexican muralists). These four packages are structured similarly. Some of the available palettes don’t work well for continuous variables (the order of the colors isn’t intuitive), but some do. Spend some time experimenting.
degas_contin <- met.brewer(name = "Degas", n = 5, type = "continuous")
continuous_map +
  scale_fill_gradientn(colors = degas_contin)
Redon_cat <- met.brewer(name = "Redon", n = 9, type = "discrete")
categorical_map +
  scale_fill_manual(values = Redon_cat)